Statistical disclosure control in tabular data
Abstract
Data disseminated by National Statistical Agencies (NSAs) can be classified as either microdata or tabular data. Tabular data is obtained from microdata by crossing one or more categorical variables. Although table cells provide only aggregated information, they also need to be protected. This chapter is a short introduction to tabular data protection. It contains three main sections. The first shows the different types of tables that can be obtained and how they are modeled. The second describes the practical rules that NSAs use to detect sensitive cells. Finally, an overview of protection methods is provided, with a particular focus on two of them: the “cell suppression problem” and “controlled tabular adjustment”.
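As a rough illustration of the first two ideas in the abstract, the sketch below crosses two categorical variables of a toy microdata set into a frequency table and then flags low-count cells. Everything in it is an assumption made for illustration: the records, the variable names `region` and `sector`, the helper names `cross_tabulate` and `sensitive_cells`, and the minimum-frequency threshold of 3. The abstract does not say which sensitivity rules the chapter covers, so this should not be read as the chapter's own method.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (illustrative values only).
microdata = (
    [{"region": "North", "sector": "Retail"}] * 4
    + [{"region": "North", "sector": "Mining"}] * 1
    + [{"region": "South", "sector": "Retail"}] * 2
    + [{"region": "South", "sector": "Mining"}] * 3
)


def cross_tabulate(records, row_var, col_var):
    """Count the records that fall in each (row category, column category) cell."""
    return Counter((r[row_var], r[col_var]) for r in records)


def sensitive_cells(table, min_count=3):
    """Flag nonzero cells with fewer than min_count contributors.

    This is a simple minimum-frequency rule, assumed here for illustration.
    """
    return sorted(cell for cell, n in table.items() if 0 < n < min_count)


table = cross_tabulate(microdata, "region", "sector")
flagged = sensitive_cells(table)

for cell, count in sorted(table.items()):
    marker = "sensitive" if cell in flagged else "safe"
    print(cell, count, marker)
```

Flagging a cell is only the detection step. A protection method such as cell suppression or controlled tabular adjustment would then have to ensure that the flagged values cannot be recovered from the published margins.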
Similar resources
Maximum Utility-Minimum Information Loss Table Server Design for Statistical Disclosure Control of Tabular Data
Statistical agencies typically serve a diverse group of end users with varying information needs. Accommodating the conflicting needs for information in combination with stringent rules for statistical disclosure limitation (SDL) of statistical information creates a special challenge. We provide a generic table server design for SDL of tabular data to meet this challenge. Our table server desig...
Information-Theoretic Disclosure Risk Measures in Statistical Disclosure Control of Tabular Data
Statistical database protection is a part of information security which tries to prevent published statistical information (tables, individual records) from disclosing the contribution of specific respondents. This paper shows how to use information-theoretic concepts to measure disclosure risk for tabular data. The proposed disclosure risk measure is compatible with a broad class of disclosure...
Tabular Statistical Disclosure Control: Optimization Techniques in Suppression and Controlled Tabular Adjustment
The problem of disseminating tabular data such that the amount of information provided satisfies the public need while protecting individually identifiable data is a problem in all governmental statistical agencies. The problem falls into the category of Statistical Disclosure Control and provides many difficult policy and technical challenges for these agencies. In order to achieve the double ...
A posteriori Disclosure Risk Measure for Tabular Data Based on Conditional Entropy
Statistical database protection, also known as Statistical Disclosure Control (SDC), is a part of information security which tries to prevent published statistical information (tables, individual records) from disclosing the contribution of specific respondents. This paper deals with the assessment of the disclosure risk associated with the release of tabular data. So-called sensitivity rules are...
On Assessing the Disclosure Risk of Controlled Adjustment Methods for Statistical Tabular Data
Minimum distance controlled tabular adjustment is a recent perturbative approach for statistical disclosure control in tabular data. Given a table to be protected, it looks for the closest safe table, using some particular distance. Controlled adjustment is known to provide high data utility. However, the disclosure risk has only been partially analyzed using theoretical results from optimizati...